vTaiwan Uber Analysis

Polis is an open-source wiki-survey platform for rapid, scalable, open-ended feedback. Participants submit short comments, which are sent out semi-randomly to other participants to vote on (by clicking agree, disagree or pass). Polis then uses statistical algorithms to find patterns of consensus and opinion groups.

This report looks at the data generated in an engagement run by the government of Taiwan in August 2015 on how Uber should be regulated in the nation. People's opinions, as reflected in this data, were then fed into a series of in-person consultations with stakeholders as part of the nation's vTaiwan deliberative process. Points of consensus were used to craft legislation that was broadly viewed as fair to all parties (including the traditional taxi companies and the citizens of Taiwan).

Basic statistics

Participants   Groups   Commenters   Comments   Votes   Votes / participant (avg)
1238           2        33           98         47985   38.76

We can take these votes and arrange them into a matrix, where rows correspond to participants and columns correspond to comments. This allows us to think of each participant as having a position in a high-dimensional space (with dimensionality equal to the number of comments).
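As a rough illustration of the arrangement described above, the sketch below builds a participant-by-comment matrix from a list of vote records. It is not the code behind this report: the function name `build_vote_matrix`, the record layout `(participant, comment, vote)`, and the encoding (1 for agree, -1 for disagree, 0 for pass, `None` for a comment the participant never saw) are all assumptions made for the example, and the sample votes are invented.

```python
def build_vote_matrix(votes):
    """Arrange (participant, comment, vote) records into a matrix.

    Rows are participants, columns are comments. Cells hold the vote
    (assumed encoding: 1 agree, -1 disagree, 0 pass) or None when the
    participant did not vote on that comment.
    """
    participants = sorted({p for p, _, _ in votes})
    comments = sorted({c for _, c, _ in votes})
    row = {p: i for i, p in enumerate(participants)}
    col = {c: j for j, c in enumerate(comments)}

    matrix = [[None] * len(comments) for _ in participants]
    for p, c, v in votes:
        matrix[row[p]][col[c]] = v
    return participants, comments, matrix


# Invented sample records, for illustration only.
votes = [
    ("alice", 0, 1), ("alice", 1, -1),
    ("bob", 0, -1), ("bob", 2, 0),
    ("carol", 1, 1),
]
participants, comments, matrix = build_vote_matrix(votes)

for name, votes_row in zip(participants, matrix):
    print(name, votes_row)
```

Each row is then one participant's position in a space with one dimension per comment. Most cells are typically empty, since a participant usually votes on only a subset of the comments.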

[Figure: the vote matrix, with participants as rows and comments as columns]

Dimensionality reduction & opinion groups

While the above visualization may be impressive, it's not particularly useful for understanding how participants' opinions relate to each other. To better understand this, we can apply a dimensionality reduction algorithm (here, principal component analysis), which captures as much of the variance in the data as possible in a lower-dimensional space. Specifically, reducing to two dimensions lets us plot participants' locations relative to each other in an opinion space, where participants are close together if they tend to agree and farther apart if they tend to disagree. Here, we also color participants according to a K-means clustering into opinion groups, which lets us ask what is important to different groups and better understand the opinion landscape.
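The two steps described above, projecting to two dimensions and clustering the projected points, can be sketched as follows. This is a minimal NumPy illustration, not the pipeline used for this report: the helper names `pca_project` and `kmeans` are hypothetical, PCA is computed here through an SVD of the centered matrix, and K-means is plain Lloyd's algorithm with a deterministic farthest-first initialization chosen only so the example is reproducible. The toy vote data is invented and has no missing votes.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto its top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each comment column
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # (participants, n_components)

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with a deterministic farthest-first initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[nearest.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)            # nearest center for each point
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):    # assignments have stabilized
            break
        centers = new_centers
    return labels, centers

# Invented toy data: two camps that vote in opposite ways on six comments.
# In each camp, one participant follows the camp line and six others each
# break with it on exactly one comment.
camp_line = np.array([1, 1, 1, -1, -1, -1], dtype=float)
pro = np.vstack(
    [camp_line]
    + [camp_line * np.where(np.arange(6) == j, -1, 1) for j in range(6)]
)
X = np.vstack([pro, -pro])                       # 14 participants x 6 comments

coords = pca_project(X)                          # 2-D opinion-space positions
labels, centers = kmeans(coords, k=2)            # one opinion group per participant
print(labels)
```

On real vote data, most participants have not voted on every comment, so the missing entries would need to be filled or otherwise handled before centering. How that is done for this report is not stated here.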

[Figure: participants plotted on the first two principal components, colored by K-means opinion group]

Below, we can see the proportion of total variance explained by the x and y axes (the first two principal components) in the plot above:

[0.21629    0.06215295]

The sharp decline in explained variance from the first to the second principal component reflects a very strong pro/con division in opinions about Uber. However, the fact that about 72% of the variance is not captured by these first two components suggests that there is still some structure not being revealed here.
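To make the arithmetic behind these figures concrete, the sketch below computes explained variance ratios from the singular values of a centered matrix, then works out the unexplained share from the two values reported above. The function name `explained_variance_ratio` and the small `toy` matrix are assumptions for illustration; only the `reported` values come from this report.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)                      # center each column
    S = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    variances = S ** 2                           # proportional to component variances
    return variances / variances.sum()

# Invented toy matrix whose two columns are uncorrelated, with four times
# as much variance in the first column as in the second.
toy = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 0.5], [0.0, -0.5]])
toy_ratio = explained_variance_ratio(toy)

# Values reported above for the first two principal components.
reported = np.array([0.21629, 0.06215295])
unexplained = float(1.0 - reported.sum())        # share left in all other components
drop = float(reported[0] / reported[1])          # how much larger PC1 is than PC2

print(round(unexplained, 4))                     # 0.7216
print(round(drop, 2))                            # 3.48
```

The first component carries roughly three and a half times the variance of the second, which is the sharp decline noted above, while the remaining components together still hold most of the total.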
